Estimating minimum effect with outlier selection
Authors
Abstract
We introduce one-sided versions of Huber’s contamination model, in which corrupted samples tend to take larger values than uncorrupted ones. Two intertwined problems are addressed: estimation of the mean of the uncorrupted samples (minimum effect) and selection of the corrupted samples (outliers). Regarding the minimum effect, we derive the minimax risks and introduce estimators that are adaptive with respect to the unknown number of contaminations. The optimal convergence rates differ from the ones in the classical Huber contamination model. This fact uncovers the effect of the one-sided structural assumption. As for the problem of selecting the outliers, we formulate a multiple testing framework in which the location and scaling of the null hypotheses are unknown. We rigorously prove that estimating the null hypothesis while maintaining a theoretical guarantee on the amount of falsely selected outliers is possible, both through the false discovery rate (FDR) and through post hoc bounds. As a by-product, we address a long-standing open issue on FDR control under equi-correlation, which reinforces the interest of removing dependency in such a setting.
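The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: Gaussian noise with known unit variance, a fixed positive shift for the corrupted samples, the sample median as a plug-in location estimate, and the standard Benjamini-Hochberg step-up procedure for selection. The function names `simulate_one_sided` and `benjamini_hochberg` and all parameter values are hypothetical, and this is not the estimator or the procedure analyzed by the authors.

```python
import random
from statistics import NormalDist, median


def simulate_one_sided(n, k, theta=0.0, shift=4.0, seed=0):
    """Draw n samples: n - k from N(theta, 1) and k from N(theta + shift, 1).

    With shift > 0 the corrupted samples tend to be larger than the clean
    ones (an assumed, simplified form of one-sided contamination).
    """
    rng = random.Random(seed)
    clean = [rng.gauss(theta, 1.0) for _ in range(n - k)]
    corrupted = [rng.gauss(theta + shift, 1.0) for _ in range(k)]
    labels = [False] * (n - k) + [True] * k
    return clean + corrupted, labels


def benjamini_hochberg(pvalues, alpha=0.1):
    """Return the sorted indices rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line.
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


samples, is_outlier = simulate_one_sided(n=1000, k=100)

# Assumption: plug-in null N(theta_hat, 1) with the median as theta_hat.
theta_hat = median(samples)
null = NormalDist(mu=theta_hat, sigma=1.0)

# One-sided p-values: large observations are evidence of contamination.
pvals = [1.0 - null.cdf(x) for x in samples]
selected = benjamini_hochberg(pvals, alpha=0.1)
false_selected = sum(1 for i in selected if not is_outlier[i])

print(f"estimated minimum effect: {theta_hat:.3f}")
print(f"selected: {len(selected)}, falsely selected: {false_selected}")
```

Because the corrupted samples only push values upward, the median in this sketch sits above the true mean of the clean samples. This is the kind of bias that motivates estimators designed specifically for the one-sided model.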
Similar resources
Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
The purpose of this paper is to identify the effective points on the performance of one of the important algorithm of data mining namely support vector machine. The final classification decision has been made based on the small portion of data called support vectors. So, existence of the atypical observations in the aforementioned points, will result in deviation from the correct decision. Thus...
Outlier Detection by Consistent Data Selection Method
Often the challenge associated with tasks like fraud and spam detection[1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or anomaly detection tasks. We also hypothesize that outliers have behavioral patterns that change over time. Limited data and continuously changing patterns ma...
Efficient Spectral Feature Selection with Minimum Redundancy
Spectral feature selection identifies relevant features by measuring their capability of preserving sample similarity. It provides a powerful framework for both supervised and unsupervised feature selection, and has been proven to be effective in many real-world applications. One common drawback associated with most existing spectral feature selection algorithms is that they evaluate features i...
Detection of Outlier-Communities using Minimum Spanning Tree
Community (also known as clusters) is a group of nodes with dense connection. Detecting outlier-communities from database is a big desire. In this paper we propose a novel Minimum Spanning Tree based algorithm for detecting outlier-communities from complex networks. The algorithm uses a new community validation criterion based on the geometric property of data partition of the data set in order...
Journal
Journal title: Annals of Statistics
Year: 2021
ISSN: 0090-5364, 2168-8966
DOI: https://doi.org/10.1214/20-aos1956